Identifying Cognate Sets Across Dictionaries of Related Languages
Abstract
We present a system for identifying cognate sets across dictionaries of related languages. The likelihood of a cognate relationship is calculated on the basis of a rich set of features that capture both phonetic and semantic similarity, as well as the presence of regular sound correspondences. The similarity scores are used to cluster words from different languages that may originate from a common protoword. When tested on the Algonquian language family, our system detects 63% of cognate sets while maintaining cluster purity of 70%.
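The pipeline described above (pairwise similarity scores, then clustering, then evaluation by cluster purity) can be illustrated with a minimal sketch. This is not the authors' system. It assumes normalized edit distance as a stand-in for the paper's phonetic, semantic, and sound-correspondence features, single-linkage clustering with a hypothetical threshold of 0.6, and one common definition of purity that may differ from the paper's. The word forms and language labels are invented toy data.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; an assumed stand-in for a learned score."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def cluster(entries, threshold=0.6):
    """Single-linkage clustering over (language, word) pairs.

    Two entries are linked only if they come from different languages
    and their similarity reaches the (hypothetical) threshold.
    """
    parent = list(range(len(entries)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(entries)), 2):
        (lang_i, word_i), (lang_j, word_j) = entries[i], entries[j]
        if lang_i != lang_j and similarity(word_i, word_j) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i, entry in enumerate(entries):
        groups.setdefault(find(i), []).append(entry)
    return list(groups.values())


def purity(clusters, gold):
    """Fraction of entries that fall in the majority gold set of their cluster."""
    total = sum(len(c) for c in clusters)
    majority = 0
    for c in clusters:
        counts = {}
        for entry in c:
            counts[gold[entry]] = counts.get(gold[entry], 0) + 1
        majority += max(counts.values())
    return majority / total


# Invented toy data: language labels A/B/C and made-up word forms.
entries = [("A", "tanak"), ("B", "tanag"), ("C", "danak"),
           ("A", "milu"), ("B", "milo")]
gold = {entries[0]: 1, entries[1]: 1, entries[2]: 1,
        entries[3]: 2, entries[4]: 2}

clusters = cluster(entries)
print(clusters, purity(clusters, gold))
```

Single linkage is chosen here only because it is short to write. It tends to chain loosely related words together, so a real system would likely need a stricter linkage criterion or a learned pairwise classifier in place of the fixed threshold.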
Similar resources
On multiword lexical units and their role in maritime dictionaries
Multi-word lexical units are a typical feature of specialized dictionaries, in particular monolingual and bilingual maritime dictionaries. The paper studies the concept of the multi-word lexical unit and considers the similarities and differences of their selection and presentation in monolingual and bilingual maritime dictionaries. The work analyses such issues as the classification of multi-w...
Obtaining SMT dictionaries for related languages
This study explores methods for developing Machine Translation dictionaries from word frequency lists derived from comparable corpora. We investigate (1) various methods for measuring the similarity of cognates between related languages and (2) the detection and removal of noisy cognate translations using SVM ranking. We show preliminary results on several Romance and Slavonic languages.
Constraint-Based Bilingual Lexicon Induction for Closely Related Languages
The lack or absence of parallel and comparable corpora makes bilingual lexicon extraction a difficult task for low-resource languages. Pivot-language and cognate-recognition approaches have proven useful for inducing bilingual lexicons for such languages. We analyze the features of closely related languages and define a semantic constraint assumption. Based on the assumption, we propose...
Effect of Cognate-Based Instruction Strategy on Vocabulary Learning Among Iranian EFL Learners
Cognates are words that are phonetically, orthographically, and semantically similar across two or more languages. The aim of the present study was to investigate the effect of a cognate-based instruction strategy on vocabulary learning among Iranian EFL learners. To this end, 80 EFL learners (15-27 years old) took part in the study; all of them were...